7 Quality Check: Processing AED. Empadronados
7.1 Input
The table to be analysed is aed.csv.
7.2 Check variables
The variables extracted from aed are: sip, fecha_registro, fecha_alta, dpto_cod, centro_cod, circ_alta_cod, circ_alta_desc, motivo_urg_cod, motivo_urg_desc, diag_cod, diag2_cod, prioridad_cod, prioridad_desc, tipo_codigo1, and tipo_codigo2.
7.2.1 Check mandatory vars
All mandatory vars are present.
7.2.2 Check all vars
fecha_alta_admin, dpto_desc, centro_desc, diag_desc, and diag2_desc were not extracted.
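The variable check above amounts to comparing an expected list of names with the names actually extracted. A minimal sketch follows. It assumes the full expected list is the union of the extracted and non-extracted variables named in this section; the function name `missing_variables` is hypothetical and is not part of the report's pipeline.

```python
def missing_variables(expected, extracted):
    """Return the expected variables that are absent, preserving their order."""
    present = set(extracted)
    return [name for name in expected if name not in present]

# Variables reported as extracted from aed in section 7.2.
extracted = [
    "sip", "fecha_registro", "fecha_alta", "dpto_cod", "centro_cod",
    "circ_alta_cod", "circ_alta_desc", "motivo_urg_cod", "motivo_urg_desc",
    "diag_cod", "diag2_cod", "prioridad_cod", "prioridad_desc",
    "tipo_codigo1", "tipo_codigo2",
]
# Assumption: the expected list is the extracted variables plus the five
# variables reported as not extracted in section 7.2.2.
expected = extracted + [
    "fecha_alta_admin", "dpto_desc", "centro_desc", "diag_desc", "diag2_desc",
]

not_extracted = missing_variables(expected, extracted)
print(not_extracted)
```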
7.2.3 Completeness
Figure 7.1 shows the percentage of non-missing values for each variable. Non-mandatory variables are shown at the bottom of the figure.
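As an illustration of the completeness measure, the percentage of non-missing values per variable could be computed along these lines. This is a sketch under assumptions: the report does not say how missing values are encoded, so `None`, empty strings, and the literal `"NA"` are treated as missing here, and the toy records are invented rather than taken from aed.csv.

```python
def completeness(rows, variables):
    """Return the percentage of non-missing values for each variable."""
    total = len(rows)
    result = {}
    for var in variables:
        # Assumption: missing values are None, empty strings, or the literal "NA".
        present = sum(1 for row in rows if row.get(var) not in (None, "", "NA"))
        result[var] = round(100 * present / total, 1) if total else 0.0
    return result

# Hypothetical toy records, not taken from aed.csv.
rows = [
    {"sip": "a1", "fecha_registro": "2015-03-02", "fecha_alta": "2015-03-02"},
    {"sip": "a2", "fecha_registro": "2016-07-10", "fecha_alta": ""},
    {"sip": "a3", "fecha_registro": "2017-01-20", "fecha_alta": "NA"},
    {"sip": "a4", "fecha_registro": "2018-11-05", "fecha_alta": "2018-11-06"},
]
pct = completeness(rows, ["sip", "fecha_registro", "fecha_alta"])
print(pct)
```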
7.3 Check content
The aed table has a total of n = 7 142 464 observations.
7.3.1 Population
- The aed table contains 1 241 653 distinct individuals, all of whom are included in the target population. They represent 67.38% of the 1 842 818 individuals in the cohort.
- Table 7.1 shows the number of distinct individuals per year of the study period.
| Year of admission | Count of distinct individuals |
|---|---|
| 2009 | 300284 |
| 2010 | 280111 |
| 2011 | 279791 |
| 2012 | 256086 |
| 2013 | 271501 |
| 2014 | 285637 |
| 2015 | 330202 |
| 2016 | 344792 |
| 2017 | 344593 |
| 2018 | 342717 |
| 2019 | 352752 |
| 2020 | 287026 |
| 2021 | 326614 |
| 2022 | 200 |
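The share of the cohort quoted above can be rechecked from the two totals, and the yearly counts can be obtained by collecting distinct identifiers per year. The sketch below assumes that the year is taken from fecha_registro and that sip identifies an individual; the function name and the toy records are hypothetical.

```python
from collections import defaultdict

# Totals quoted in section 7.3.1.
individuals_in_aed = 1_241_653
individuals_in_cohort = 1_842_818
share = round(100 * individuals_in_aed / individuals_in_cohort, 2)
print(share)

def distinct_individuals_per_year(records):
    """Count distinct identifiers per year from (sip, 'YYYY-MM-DD') pairs."""
    ids_by_year = defaultdict(set)
    for sip, day in records:
        ids_by_year[day[:4]].add(sip)  # the first four characters are the year
    return {year: len(ids) for year, ids in sorted(ids_by_year.items())}

# Hypothetical toy records, not taken from aed.csv.
toy_records = [
    ("a1", "2009-01-05"),
    ("a1", "2009-06-01"),  # same individual, same year: counted once
    ("a2", "2009-02-02"),
    ("a1", "2010-03-03"),  # same individual, another year: counted again
]
per_year = distinct_individuals_per_year(toy_records)
print(per_year)
```

Because an individual can appear in several years, the yearly counts in Table 7.1 add up to more than the 1 241 653 distinct individuals.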
7.3.2 Date of the admission
The variable fecha_registro is missing in 0 observations, so it is 100% complete. The minimum and maximum dates are 2009-01-01 and 2022-01-01, respectively. Table 7.2 shows the number of admissions per year of fecha_registro.
There are dates outside the study period. Of the non-missing dates:
- 100% are inside the study period.
- 0% occurred before the start of the study period.
- 0% occurred after the end of the study period.
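The breakdown above classifies each non-missing date against the bounds of the study period. A minimal sketch of that classification follows. The report does not state the study period, so the bounds used here are hypothetical, as are the function name `classify_dates` and the toy dates.

```python
from datetime import date

def classify_dates(dates, start, end):
    """Return the percentage of non-missing dates inside, before, and after [start, end]."""
    counts = {"inside": 0, "before": 0, "after": 0}
    for d in dates:
        if d is None:  # missing dates are excluded from the denominator
            continue
        if d < start:
            counts["before"] += 1
        elif d > end:
            counts["after"] += 1
        else:
            counts["inside"] += 1
    non_missing = sum(counts.values())
    if non_missing == 0:
        return {key: 0.0 for key in counts}
    return {key: 100 * n / non_missing for key, n in counts.items()}

# Hypothetical study period and toy dates (assumptions, not from the report).
study_start, study_end = date(2010, 1, 1), date(2019, 12, 31)
toy_dates = [
    date(2009, 12, 31),  # before the start
    date(2010, 1, 1),    # inside (the bounds are inclusive)
    date(2015, 6, 15),   # inside
    date(2020, 1, 1),    # after the end
    None,                # missing, ignored
]
shares = classify_dates(toy_dates, study_start, study_end)
print(shares)
```

When such shares are rounded to whole percentages, a category that holds only a handful of dates among millions is displayed as 0%.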
| Year of the admission | Count |
|---|---|
| 2009 | 530703 |
| 2010 | 496300 |
| 2011 | 490872 |
| 2012 | 454297 |
| 2013 | 480004 |
| 2014 | 512651 |
| 2015 | 594314 |
| 2016 | 628301 |
| 2017 | 624208 |
| 2018 | 621110 |
| 2019 | 640362 |
| 2020 | 492889 |
| 2021 | 576250 |
| 2022 | 203 |
The month with the fewest admissions was January 2022, with n = 203, and the month with the most admissions was July 2021, with n = 57648.
Figure 7.2, Figure 7.3, and Figure 7.4 present the frequencies of admissions by year, month, and day, respectively.
7.3.3 Date of the discharge
The variable fecha_alta is missing in 1028167 observations, so it is 85.6% complete. The minimum and maximum dates are 2009-01-01 and 2022-03-24, respectively. Table 7.3 shows the number of discharges per year of fecha_alta.
There are dates outside the study period. Of the non-missing dates:
- 100% are inside the study period.
- 0% occurred before the start of the study period.
- 0% occurred after the end of the study period.
| Year of the discharge | Count |
|---|---|
| 2009 | 221696 |
| 2010 | 298642 |
| 2011 | 369416 |
| 2012 | 396751 |
| 2013 | 396434 |
| 2014 | 481902 |
| 2015 | 566635 |
| 2016 | 596556 |
| 2017 | 599233 |
| 2018 | 580852 |
| 2019 | 607696 |
| 2020 | 462936 |
| 2021 | 535284 |
| 2022 | 264 |
| NA | 1028167 |
The months with the fewest discharges were February 2022 and March 2022, each with n = 1. The group reported as having the most discharges, NA NA with n = 1028167, corresponds to the observations with a missing fecha_alta, not to a calendar month.
Figure 7.5, Figure 7.6, and Figure 7.7 present the frequencies of discharges by year, month, and day, respectively.
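The monthly minimum and maximum are only meaningful when missing dates are dropped before counting. The sketch below shows one way to do this and to report ties. It assumes dates are ISO strings and that missing values are `None`, empty strings, or `"NA"`; the function name `month_extremes` and the toy data are hypothetical.

```python
from collections import Counter

def month_extremes(dates):
    """Return (months with fewest, fewest count, months with most, most count)."""
    # Drop missing values first so they cannot be counted as a month of their own.
    counts = Counter(d[:7] for d in dates if d not in (None, "", "NA"))
    if not counts:
        return [], 0, [], 0
    low, high = min(counts.values()), max(counts.values())
    fewest = sorted(month for month, n in counts.items() if n == low)
    most = sorted(month for month, n in counts.items() if n == high)
    return fewest, low, most, high

# Hypothetical toy dates; four of the nine values are missing.
toy_dates = [
    "2021-07-01", "2021-07-15", "2021-07-20",
    "2022-02-01", "2022-03-24",
    None, "NA", "", None,
]
fewest, low, most, high = month_extremes(toy_dates)
print(fewest, low, most, high)
```

In this toy example the missing values outnumber every month, yet they do not appear in the result.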
7.3.4 Visit service
The variable motivo_urg_desc is missing in 0 observations, so it is 100% complete. Table 7.4 lists, in alphabetical order, all the services recorded in motivo_urg_desc. Although no value is missing, the table includes the categories [Sin referencia] ("no reference", 28552 observations) and [Vacío] ("empty", 18 observations), which appear to act as placeholders. Figure 7.8 shows the count for each service. Finally, Figure 7.9 shows the yearly counts for the 10 most used services.
| Service | Count | Percentage |
|---|---|---|
| Accidente casual | 660352 | 9.25% |
| Accidente de trabajo | 66408 | 0.93% |
| Accidente de tráfico | 159141 | 2.23% |
| Agresión | 48185 | 0.67% |
| Autolesión | 17667 | 0.25% |
| COVID-sospecha | 50453 | 0.71% |
| Caso Relacionado con Gripe A | 2658 | 0.04% |
| Enfermedad común | 5524973 | 77.35% |
| Otras causas | 584057 | 8.18% |
| [Sin referencia] | 28552 | 0.40% |
| [Vacío] | 18 | 0.00% |
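The percentages in Table 7.4 can be recomputed from the counts. The sketch below uses the counts from the table and Python's default string ordering, which reproduces the order of the table. The variable names are illustrative only.

```python
# Counts copied from Table 7.4.
counts = {
    "Accidente casual": 660352,
    "Accidente de trabajo": 66408,
    "Accidente de tráfico": 159141,
    "Agresión": 48185,
    "Autolesión": 17667,
    "COVID-sospecha": 50453,
    "Caso Relacionado con Gripe A": 2658,
    "Enfermedad común": 5524973,
    "Otras causas": 584057,
    "[Sin referencia]": 28552,
    "[Vacío]": 18,
}

total = sum(counts.values())

# One row per category: name, count, and share of the total with two decimals.
table = [
    (name, n, f"{100 * n / total:.2f}%")
    for name, n in sorted(counts.items())
]

for name, n, pct in table:
    print(f"| {name} | {n} | {pct} |")
```

The counts add up to 7 142 464, the total number of observations reported for the aed table.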
7.3.5 Diagnosis codes: diag_cod
The variable diag_cod is missing in 0 observations, so it is 100% complete. Figure 7.10 shows the most frequently used diagnosis codes. Finally, Figure 7.11 shows the yearly counts of the 10 most frequently used codes.
7.3.6 Diagnosis codes: diag2_cod
The variable diag2_cod is missing in 0 observations, so it is 100% complete. Figure 7.12 shows the most frequently used diagnosis codes. Finally, Figure 7.13 shows the yearly counts of the 10 most frequently used codes.
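The per-year figures for the most frequent codes can be built by selecting the top codes and then tallying them by year. The sketch below assumes the top 10 are selected over the whole period and then broken down per year; the report does not say whether the ranking is global or per year. The function name and the codes `X1`, `X2`, and `X3` are hypothetical.

```python
from collections import Counter, defaultdict

def top_codes_per_year(records, n=10):
    """Return the n most frequent codes overall and their counts per year.

    records is an iterable of (year, code) pairs.
    """
    records = list(records)
    overall = Counter(code for _, code in records)
    top = [code for code, _ in overall.most_common(n)]
    selected = set(top)
    per_year = defaultdict(Counter)
    for year, code in records:
        if code in selected:  # codes outside the top n are left out
            per_year[year][code] += 1
    return top, {year: dict(c) for year, c in sorted(per_year.items())}

# Hypothetical toy records with made-up codes.
toy_records = [
    (2009, "X1"), (2009, "X1"), (2009, "X2"),
    (2010, "X1"), (2010, "X2"), (2010, "X3"),
]
top, per_year_counts = top_codes_per_year(toy_records, n=2)
print(top)
print(per_year_counts)
```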
7.3.7 Code vocabulary: tipo_codigo1
The variable tipo_codigo1 is missing in 0 observations, so it is 100% complete. Figure 7.14 shows the count of each code vocabulary. Finally, Figure 7.15 shows the yearly counts of the 10 most used code vocabularies.
7.3.8 Code vocabulary: tipo_codigo2
The variable tipo_codigo2 is missing in 0 observations, so it is 100% complete. Figure 7.16 shows the count of each code vocabulary. Finally, Figure 7.17 shows the yearly counts of the 10 most used code vocabularies.